Original Article

Validating  the  Use  of  ICD-9-CM  Codes  to 
Evaluate  Gestational  Age  and  Birth  Weight 

John P. Barrett, MD, MS, MPHa; Carter J. Sevick, MSb; Ava Marie S. Conlin, DO, MPHb; Gia R. Gumbs, MPHb;

Sydney Lee, MSb; Diane R Martin, PhDc; Tyler C. Smith, MS, PhDb

Abstract: Background: Efforts to reduce preterm and low-weight births are among the leading public health objectives in
the United States and the world. A necessary component of any public health endeavor is surveillance. The Department
of Defense (DoD) Birth and Infant Health Registry (Registry) uses electronic healthcare utilization data to assess reproductive
health outcomes among military families. Infant health outcomes are coded using the International Classification of
Diseases, 9th Revision, Clinical Modification (ICD-9-CM). The objective of this study was to determine the accuracy of
using electronically derived ICD-9-CM codes for assessing gestational age and birth weight among Registry infants compared
to medical records. Methods: The authors assessed birth outcome agreement by comparing electronic Registry data
for infants born at military treatment facilities (MTFs) from 1999-2002 and 1,858 randomly selected birth medical records
from 17 MTFs, with descriptive statistics and measures of agreement, including the kappa statistic. Results: Of the 1,858
reviewed infant records, 1,669 were successfully matched to the Registry analytic dataset for analyses. Despite small differences
in parental demographics, this investigation established "near perfect" agreement for the primary outcomes: kappa
of 0.83 for preterm and 0.87 for low birth weight. Subgroup analyses revealed no significant differences in gestational age
and birth-weight agreement based on the presence of a birth defect, military parent rank, branch of military service, or
specific hospital characteristics. Conclusions: Electronically derived ICD-9-CM codes provide an accurate assessment of
the gestational age and low birth weight reflected in the birth medical records of infants in a large birth and infant health
registry. These findings support the integrity of Registry data for investigations assessing preterm and low-weight births
among U.S. service member families.

Key words: birth registry, data validation, estimated gestational age, preterm birth, birth weight


Introduction 

Preterm and low birth weight infants are at high risk
for neonatal death and long-term health consequences
compared with full-term and normal-weight infants. In the
United States, 12.8% of infants are born preterm, defined
as less than 37 completed weeks of gestation at birth, and
22.4% of all infant deaths are related to preterm birth.
Almost 70% of infant deaths occur among the 8.1% of infants
of low birth weight, or less than 2500 grams at delivery.1-5
Among the many sequelae that disproportionately afflict
preterm infants are neurodevelopmental impairments, with
approximately 75% of cerebral palsy cases associated with
early births.6,7 As gestational age and birth weight increase,
health complications associated with preterm birth decrease;
however, even near-term (late preterm) infants (≥34 to <37
weeks estimated gestational age [EGA]) are at risk for health
problems, including school-age developmental delays and
disabilities.8-13 In 2005, an estimated $26 billion was spent
providing health care in the United States to infants born
preterm.10,14

Given their enormous societal burden, efforts to reduce
preterm and low-weight births are leading public health
objectives in the United States and the world.10,15-17 A
necessary component of any public health endeavor is
surveillance. For the US military, assessing parental
occupational exposures and reproductive health outcomes is a
primary undertaking of the Department of Defense (DoD)
Birth and Infant Health Registry (Registry), maintained at
the Deployment Health Research Department at the Naval
Health Research Center. The Registry was established in
1998 in recognition of the need to monitor the reproductive
health of military families.18,19 The Registry captures
electronic International Classification of Diseases, 9th Revision,
Clinical Modification (ICD-9-CM) codes and other health
data from several databases on infants from birth to 1 year
of life. To date, Registry researchers have conducted a
number of investigational and analytical protocols focused
primarily on birth defects, although preterm birth has also
been assessed and found to range from 7.1% to 7.6%.20-22
The objectives of this study were to assess the accuracy of
ICD-9-CM codes to identify subcategories of preterm and
low birth weight outcomes captured in this large birth and
infant health registry compared with medical records from
which the Registry data were derived.

140 Sylvester Road, San Diego, CA. cUniversity of Washington, Department of Health Services, Seattle, WA.

Methods

Population  and  Data  Sources 

The Department of Defense (DoD) Birth and Infant
Health Registry was established in 1998 to increase the
understanding of the reproductive health effects of military
service by providing systematic surveillance of DoD
beneficiary births and scientifically rigorous research of infant
health outcomes. Data sources for the Registry include
the Defense Manpower Data Center (DMDC) and the
Defense Enrollment Eligibility Reporting System (DEERS),
the central sources for personnel data for the DoD
community. Military Health System Data Repository (MDR) data,
also captured in the Registry, contain healthcare
utilization data based on International Classification of Diseases,
Ninth Revision, Clinical Modification (ICD-9-CM) coding
for inpatient and outpatient care received at military
treatment facilities (MTFs) and civilian facilities. These electronic
data sources allow the Registry to define live births and
infant health outcomes through the first year of life among
the approximately 100,000 infants born to military families
each year.

Infant data are linked to the military parent's
(sponsor's) demographic data including age, race/ethnicity, sex,
educational attainment, service branch, rank, and marital
status. The Registry captured over 300,000 infant births
from years 1999 to 2002, with approximately 60% of these
births occurring in MTFs, and the remainder occurring in
civilian medical facilities.18-20,23 Infants are excluded from
the Registry analytic database if data cannot be reliably
linked to subsequent healthcare encounters. An example is
same-sex multiples, who are excluded due to the inability
to consistently differentiate their initial health care prior
to the assignment of a unique medical identifier. Exclusion
would also occur if changes in identifying information after
the infant's birth do not match information in the DEERS,
or if for any reason DoD medical benefits are discontinued
shortly after birth, such as when a military parent leaves the
service before the infant's DEERS registration.

Due to the difficulty in obtaining medical records
from civilian facilities, infant birth records were limited
to infants born at MTFs from 1999-2002. The resulting
medical records, hereafter referred to as the validation
sample, included 1,858 copies of medical records, ranging
from complete records to limited excerpts of care. Initially
obtained to validate Registry birth defect data, the
validation sample oversamples birth defects. The 17 MTFs from
which the records were collected represent a stratified
random sample of MTFs, selected to ensure a mix of large
and small facilities in the United States and abroad from
each branch of military service. The Registry team requested
infant birth medical records from DoD electronic hospital
birth lists for a given year through a stratified random
selection, based on the presence or absence of a birth defect,
without prior knowledge of whether the selected infant
records were ultimately included in the Registry analytic
file. Ten percent of birth defect records and up to 1% of
non-birth defect records were requested from each facility.
The comparison group included infants born at MTFs from
1999-2002 and captured in the DoD Birth and Infant Health
Registry. For appropriate comparison, these data were limited
to those contained in records from MTF care and excluded
all care received at civilian facilities. This group is hereafter
referred to as the Registry sample.

Outcomes 

Gestational  age  and  birth  weight  were  defined  using 
ICD-9-CM codes. Codes 765.0x and 765.1x represent extreme
preterm (<28 weeks and/or <1000 grams) and "other
preterm infants" (≥28 weeks and <37 completed weeks
EGA), respectively. Code 764.xx refers to slow fetal growth
and malnutrition. For low birth weight, only the fifth digit
on ICD-9-CM codes 764.xx and 765.xx was used, as the
fourth digit does not specifically refer to birth weight. If an
electronic  record  lacked  any  of  the  above-mentioned  codes, 
full-term  or  normal  birth  weight  was  assumed.  If  multiple 
codes  were  listed,  the  code  indicating  the  shortest  EGA 
or  lower  birth  weight  was  used.  Of  note,  the  code  765.2x 
(weeks  of  gestation)  was  introduced  in  fiscal  year  2003  to 
indicate  specific  EGA  ranges  and  only  applies  to  infants 
born  on  or  after  October  1,  2002.  A  total  of  109  infants  in  the 
validation  sample  were  born  after  this  date;  however,  only 
records  for  4  of  these  infants  used  the  new  ICD-9-CM  codes. 
These  infants  were  retained  in  this  analysis  and  classified  to 
the  appropriate  category  according  to  pre-fiscal  year  2003 
ICD-9-CM  code  criteria. 
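
The coding rules in this section can be illustrated with a minimal Python sketch. It is not the Registry's actual software: the function names, the plain-string code format, and the regular expression are assumptions made for illustration. The fifth-digit weight ranges follow the categories listed in Table 3.

```python
import re

# Fifth-digit birth weight ranges in grams (as listed in Table 3).
# Digit 0 ("unspecified") carries no weight information and is ignored.
FIFTH_DIGIT_GRAMS = {
    1: "<500", 2: "500-749", 3: "750-999", 4: "1000-1249", 5: "1250-1499",
    6: "1500-1749", 7: "1750-1999", 8: "2000-2499", 9: ">=2500",
}

# Matches 764.xx, 765.0x and 765.1x; the fifth digit is optional.
CODE_PATTERN = re.compile(r"^(764\.\d|765\.[01])(\d)?$")

GA_LABELS = ("full-term", "other preterm", "extreme preterm")


def classify_record(codes):
    """Return (gestational age category, fifth digit or None) for one infant."""
    ga_rank = 0  # 0 = full-term, assumed when no qualifying code is present
    digits = []
    for code in codes:
        match = CODE_PATTERN.match(code.strip())
        if not match:
            continue  # unrelated diagnosis, or 765.2x, which is not used here
        stem, fifth = match.group(1), match.group(2)
        if stem == "765.0":
            ga_rank = max(ga_rank, 2)  # keep the shortest gestational age
        elif stem == "765.1":
            ga_rank = max(ga_rank, 1)
        if fifth is not None and int(fifth) in FIFTH_DIGIT_GRAMS:
            digits.append(int(fifth))
    # The lowest fifth digit is the lowest weight range; with no usable
    # digit, normal birth weight is assumed.
    weight_digit = min(digits) if digits else None
    return GA_LABELS[ga_rank], weight_digit


def is_low_birth_weight(weight_digit):
    """Low birth weight (<2500 g) corresponds to fifth digits 1 through 8."""
    return weight_digit is not None and weight_digit <= 8


print(classify_record(["765.18", "764.07"]))  # ('other preterm', 7)
```

When several codes are present, the sketch keeps the most severe category for each outcome, mirroring the rule that the code indicating the shortest EGA or lower birth weight was used.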

Data  Extraction  and  Matching 

Data  extraction  from  validation  sample  records  and 
matching  of  this  information  to  Registry  sample  data  was 
conducted  May-December  2009.  For  this  process,  a  data 
extraction  sheet  was  generated  and  information  collected 
from  birth  medical  records,  which  included  personal 
identifying  information  and  demographic  data  (hospital 
identification  number,  military  parent-sponsor  Social 
Security  number,  name,  sex,  and  date  of  birth),  and  other 
data  of  interest  (eg,  twin  and  higher  order  births,  known 
perinatal  death).  EGA  and  birth-weight  information  were 
obtained  from  the  infant  medical  record,  specifically  the 
newborn record data sheet/profile, clinical record, or
admission/discharge notes, which include maternal information,
labor  and  delivery  data,  transition  period  information,  and 
a  physical  assessment  of  the  infant  at  birth,  throughout  their 
hospital  stay,  and  at  discharge.  After  record  extraction,  EGA 
and  birth-weight  data  were  converted  to  their  appropriate 
ICD-9-CM  codes  for  comparisons  with  Registry  sample 
data.  Data  accuracy  was  confirmed  twice  during  manual 
extraction and again during entry into an electronic
spreadsheet. Infant records were excluded from analyses if they
lacked  key  demographic  or  EGA  data  or  were  otherwise 
excluded  from  the  Registry  analytic  file. 
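
As an illustration of the conversion step, the following Python sketch maps chart values to the categories used for comparison. The function names are hypothetical; the day thresholds (196 and 259 days) and the gram ranges are those shown in Tables 2 and 3.

```python
from bisect import bisect_right

# Upper bounds (grams, exclusive) of the weight ranges for fifth digits 1-8;
# anything at or above 2500 g maps to digit 9.
WEIGHT_BOUNDS = [500, 750, 1000, 1250, 1500, 1750, 2000, 2500]


def ega_days(weeks, days=0):
    """Convert an EGA recorded as completed weeks plus days into days."""
    return weeks * 7 + days


def ega_category(days):
    """Map EGA in days to the gestational age category used in this study."""
    if days < 196:   # <28 completed weeks
        return "extreme preterm"
    if days < 259:   # 28 to <37 completed weeks
        return "other preterm"
    return "full-term"


def weight_fifth_digit(grams):
    """Map a birth weight in grams to the corresponding ICD-9-CM fifth digit."""
    return bisect_right(WEIGHT_BOUNDS, grams) + 1


print(ega_category(ega_days(36, 6)), weight_fifth_digit(2499))
```

This sketch classifies gestational age from EGA alone. The text of code 765.0 also admits infants under 1000 grams, which matters for the extreme preterm comparison reported in the Results.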

Table 1. Infant Characteristic Comparisons Between the Matched Validation Sample and the DoD Birth and Infant Health Registry, 1999-2002*

| Characteristic | Sample (%), n=1,656† | Registry (%) | P-value |
|---|---|---|---|
| Mother, military | 23.9 | 22.1 | 0.069 |
| Military parent race/ethnicity | | | 0.064 |
| White | 58.5 | 61.5 | |
| Black | 21.3 | 19.7 | |
| Hispanic | 10.4 | 10.1 | |
| Other | 9.8 | 8.7 | |
| Military parent rank | | | <0.001 |
| Enlisted | 86.0 | 81.9 | |
| Officer | 12.7 | 16.9 | |
| Unknown | 1.3 | 1.2 | |
| Maternal age, years | | | 0.996 |
| ≤18 | 2.3 | 2.3 | |
| 19-34 | 89.5 | 89.5 | |
| ≥35 | 8.2 | 8.2 | |
| Maternal marital status | | | <0.001 |
| Married | 86.7 | 89.3 | |
| Unmarried | 10.9 | 8.0 | |
| Unknown | 2.4 | 2.8 | |
| Military parent service branch | | | <0.001 |
| Army | 41.1 | 40.7 | |
| Navy | 38.9 | 25.3 | |
| Air Force | 8.3 | 20.5 | |
| Marine Corps | 9.5 | 11.5 | |
| Coast Guard and other | 2.2 | 2.0 | |
| Infants with birth defects | 22.7 | 3.3 | <0.001 |
| Infant sex | | | 0.535 |
| Male | 52.2 | 51.4 | |
| Female | 47.8 | 48.6 | |

*Limited to births at military treatment facilities.
†Thirteen subjects lacked full parent information and therefore are not included in this table.

Medical record data were matched to the Registry
database in a three-step process. The first matched subjects
from both sources of data with perfect matches for sex,
date of birth, and military parent-sponsor Social Security
number (which is present on infant records). The second
was a re-examination of all non-perfect matching records
to check for any transcription errors, followed by repeating
step 1. Third, lists were generated for all non-matching
infants' records and compared with all possible records
contained in the Registry with the same hospital of birth,
and a near match on date of birth. From these lists, final
matches between the 2 data sources were determined
through positive name matches and minor differences in
other variables, such as date of birth.
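
The automated parts of the matching process described above can be sketched in Python as follows. This is an illustration only: the field names (`chart_id`, `registry_id`, `sponsor_ssn`, `hospital`, and so on) and the 3-day window for a "near match" on date of birth are assumptions, and step 2 (rechecking transcription errors) and the final name comparison are manual and are not modeled.

```python
from datetime import date, timedelta


def exact_key(record):
    """Step 1 key: sponsor Social Security number, infant sex, date of birth."""
    return (record["sponsor_ssn"], record["sex"], record["dob"])


def match_records(chart_records, registry_records, dob_window_days=3):
    """Return (exact matches, candidate lists for manual review)."""
    registry_by_key = {}
    for reg in registry_records:
        registry_by_key.setdefault(exact_key(reg), []).append(reg)

    window = timedelta(days=dob_window_days)  # hypothetical tolerance
    matched, candidates = [], {}
    for chart in chart_records:
        hits = registry_by_key.get(exact_key(chart), [])
        if len(hits) == 1:
            matched.append((chart["chart_id"], hits[0]["registry_id"]))
            continue
        # Step 3: list Registry records from the same hospital of birth with
        # a near match on date of birth; a reviewer then compares names.
        candidates[chart["chart_id"]] = [
            reg["registry_id"]
            for reg in registry_records
            if reg["hospital"] == chart["hospital"]
            and abs(reg["dob"] - chart["dob"]) <= window
        ]
    return matched, candidates
```

Leaving the final decision to a reviewer in step 3 mirrors the text, where final matches were determined through positive name matches and minor differences in other variables.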

Statistical  Analysis 

Descriptive statistics and measures of agreement,
including sensitivity, specificity, overall agreement, and
the kappa statistic, were used to compare the validation
and Registry samples for the outcome measures of interest.
Subgroup analysis was conducted to determine if measures
of agreement varied based on the presence of a birth defect,
parental characteristics, and specific hospital factors. For
the purposes of calculations, the information in the
validation sample was assumed to be true ("gold standard").
Sensitivity was defined as the probability that, given a
condition is present, a test will be positive. Specificity was
defined as the probability that, given a condition is absent,
a test will be negative. Percent agreement was calculated as
the total number of infants classified to the same category
from both the validation sample and the Registry sample,
divided by the total number of infant records in the study.
The kappa statistic measures agreement between data
sources above what is expected from chance alone. A kappa
statistic in the range 0.8-1.0 represents "near perfect"
agreement, 0.6-0.8 "substantial" agreement, 0.4-0.6 "moderate"
agreement, 0.2-0.4 "fair" agreement, and 0.0-0.2 "slight or
poor" agreement.24-26 All statistical analyses were performed
using SAS software, version 9.2 (SAS Institute, Inc., Cary,
NC).
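
The measures defined above can be reproduced from a 2x2 table. The Python sketch below is an illustration, not the SAS code used in the study. It uses the preterm (<37 weeks) counts reported in Table 2: 148 infants preterm in both sources, 37 preterm only in the validation sample, 17 preterm only in the Registry sample, and 1,467 full-term in both. The confidence interval uses a simple large-sample standard error, which is an assumption; SAS may apply a different variance formula.

```python
import math


def agreement_stats(tp, fn, fp, tn):
    """Agreement measures for a 2x2 table, medical record as gold standard."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement, from the marginal totals of the two data sources.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    # Simple large-sample standard error (assumed approximation).
    se = math.sqrt(observed * (1 - observed) / (n * (1 - expected) ** 2))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "percent_agreement": observed,
        "kappa": kappa,
        "kappa_ci95": (kappa - 1.96 * se, kappa + 1.96 * se),
    }


def agreement_label(kappa):
    """Map a kappa value to the descriptive ranges used in this paper."""
    for cutoff, label in [(0.8, "near perfect"), (0.6, "substantial"),
                          (0.4, "moderate"), (0.2, "fair")]:
        if kappa >= cutoff:
            return label
    return "slight or poor"


stats = agreement_stats(tp=148, fn=37, fp=17, tn=1467)
print(round(stats["kappa"], 2), agreement_label(stats["kappa"]))
```

With these counts the sketch gives a sensitivity of 80.0%, specificity of 98.9%, overall agreement of 96.8%, and a kappa of 0.83, in line with the values shown in Table 2.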

Results 

The validation sample included 1,858 records. After
removing 20 records that lacked specific birth information,
1,838 records remained in the validation sample for possible
matching to the Registry sample. Among the remaining
1,838 records in the validation sample, 1,669 (90.8%) were
successfully matched to a record in the Registry analytic
database and were used for analyses. Among the 169
nonmatching records, 151 records matched to a file containing
records routinely excluded from the Registry analytic
database and 18 remained unmatched, possibly due to
changes in identifying information after the infants' births.
Among the 169 unmatched validation records, there were 53
preterm infants and 47 low birth weight infants. Adjusting
for the high rate of same-sex multiples among the infants in
the non-analytic dataset, these rates are similar to rates for
the 1,669 validation sample records (P=0.86 for preterm and
P=0.08 for low-weight births).

Table 1 shows a comparison of parental demographic
characteristics for validation sample infants and Registry
births from 1999-2002 at MTFs. Although the groups were
demographically similar, parents of infants in the validation
sample were more likely to be of enlisted rank (86.0% vs 81.9%),
unmarried (10.9% vs 8.0%), or in the Navy (38.9% vs 25.3%),
and less likely to serve in the Marine Corps (9.5% vs 11.5%)
or Air Force (8.3% vs 20.5%). A larger percentage of infants
in the validation sample had birth defects (22.7% compared
to 3.3%).

Table 2 shows measures of agreement and comparisons
between the validation sample and Registry sample
for EGA. Agreement was "substantial" or higher for all
comparisons except for the extreme preterm outcome, where
agreement was "fair to moderate." Shown are 2 different
cut points for preterm birth to illustrate how measures
of agreement vary based on slight differences in possible
research criteria in the 36th EGA week window (252-258
days). Additional analyses did not reveal differences in
measures of agreement based on the presence or absence
of a birth defect: for preterm (<37 weeks, <259 days), the
kappa statistic was 0.83 for infants both without and with a
birth defect (95% confidence intervals: 0.77, 0.88, and 0.75,
0.91, respectively). Nor were there significant differences
with changes in military parent rank, parent branch of
military service, or hospital-specific factors, including size and
services (eg, larger medical centers with neonatal intensive
care vs smaller community hospitals), or military branch of
service running the medical facility (data not shown).

Table 2. Gestational Age (GA) Agreement and Comparison Between Validation Sample and Registry Sample

| GA (n=1669) | Validation Sample | Registry Sample | Number in Agreement | Percent Sensitivity | Percent Specificity | Percent Agreement | Kappa Statistic* | 95% CI* |
|---|---|---|---|---|---|---|---|---|
| Full-term: ≥37 weeks, ≥259 days | 1484 | 1504 | 1467 | 98.9 | 80.0 | 96.8 | 0.83 | 0.78, 0.87 |
| Preterm†: <37 weeks, <259 days | 185 | 165 | 148 | 80.0 | 98.9 | 96.8 | 0.83 | 0.78, 0.87 |
| Preterm†: <36 weeks, <252 days | 108 | 165 | 102 | 94.4 | 96.0 | 95.9 | 0.73 | 0.66, 0.79 |
| Extremely preterm: <28 weeks, <196 days | 10 | 23 | 8 | 80.0 | 99.1 | 99.0 | 0.48 | |

Validated GA Ranges

| GA Categories** | ≥37 wks (≥259 d) | <37 wks (<259 d)† | 36 wks (252-258 d) | 35 wks (245-251 d) | 34 wks (238-244 d) | 32 to <34 wks (224-237 d) | ≥28 to <32 wks (196-223 d) | <28 wks (<196 d) |
|---|---|---|---|---|---|---|---|---|
| VS, n=1669 | 1484 | 185 | 77 | 38 | 16 | 21 | 23 | 10 |
| RS full-term, n=1504 | 1467 | 37 | 31 | 4 | 0 | 1 | 1 | 0 |
| RS preterm, n=142 | 14 | 128 | 46 | 34 | 13 | 17 | 16 | 2 |
| RS extreme preterm, n=23 | 3 | 20 | 0 | 0 | 3 | 3 | 6 | 8 |

GA = gestational age; CI = confidence interval; VS = validation sample; RS = registry sample; wks = weeks; d = days.
*P < 0.0001 for all kappa statistics, exact P-values computed for extremely preterm; 95% CI = 95% confidence interval for the kappa statistics, not included for extremely preterm due to small sample size.
†Preterm category is not mutually exclusive and includes all births occurring less than the defining gestational age, including those extremely preterm.
**Registry data from ICD-9-CM codes 765.0 ("extreme preterm infants, <1000 g and/or <28 completed weeks") and 765.1 (other preterm infants, ≥28 and <37 weeks gestation); lack of these ICD-9-CM codes indicates a full-term birth.

Also  shown  in  Table  2  are  EGA  frequency  counts  for 
subjects  matched  in  the  validation  sample  and  the  Registry 
sample.  The  Registry  misclassified  full-term  births  by  1.15% 
(17/1,484);  and  fully  83.8%  (31/37)  of  all  false-negative 
preterm  births  were  infants  born  between  36  and  37  weeks 
EGA  (252-258  days).  There  was  no  difference  in  EGA 
misclassification  rates  based  on  the  presence  or  absence  of 
a  birth  defect  (P=0.37).  The  Registry  sample  classified  23 
infants  as  extreme  preterm  births  compared  with  10  in  the 
validation sample. However, the ICD-9-CM code indicating
extreme preterm birth (765.0) applies to infants "less than
1000 grams and/or 28 completed weeks"; thus, using both
criteria  for  weight  and  EGA,  5  additional  infants  in  the 
validation  sample  can  be  included  in  this  category.  Three  of 
the  23  infants  categorized  in  the  Registry  sample  as  extreme 
preterm  were  noted  to  be  full  term  on  chart  review. 
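
The effect of the cut point can be checked directly from the frequency counts in Table 2. The Python sketch below collapses the cross-tabulation into a 2x2 table for a chosen threshold; the data layout and function name are assumptions made for illustration, while the counts are those reported in Table 2.

```python
# Validated GA ranges (columns), ordered from longest to shortest gestation.
GA_BINS = [">=37 wk", "36 wk", "35 wk", "34 wk",
           "32 to <34 wk", "28 to <32 wk", "<28 wk"]

# Registry sample category (rows) by validated GA range, from Table 2.
CROSSTAB = {
    "full-term":       [1467, 31, 4, 0, 1, 1, 0],
    "other preterm":   [14, 46, 34, 13, 17, 16, 2],
    "extreme preterm": [3, 0, 0, 3, 3, 6, 8],
}


def two_by_two(first_positive_bin, positive_rows):
    """Collapse the cross-tab into (tp, fn, fp, tn).

    Chart-positive: validated GA bins at index >= first_positive_bin.
    Registry-positive: Registry categories listed in positive_rows.
    """
    tp = fn = fp = tn = 0
    for row, counts in CROSSTAB.items():
        registry_positive = row in positive_rows
        for index, count in enumerate(counts):
            chart_positive = index >= first_positive_bin
            if chart_positive and registry_positive:
                tp += count
            elif chart_positive:
                fn += count
            elif registry_positive:
                fp += count
            else:
                tn += count
    return tp, fn, fp, tn


preterm_rows = {"other preterm", "extreme preterm"}
for label, first_bin in [("<37 weeks", 1), ("<36 weeks", 2)]:
    tp, fn, fp, tn = two_by_two(first_bin, preterm_rows)
    print(label, tp, fn, fp, tn,
          round(100 * tp / (tp + fn), 1), round(100 * tn / (tn + fp), 1))
```

Moving the cut point from <37 to <36 weeks shifts the 36-week column from chart-positive to chart-negative, so the 31 false negatives in that column become true negatives and the 46 true positives become false positives. This is why sensitivity rises while specificity falls.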

Table  3  shows  measures  of  agreement  and  compares 
birth-weight  data  in  the  validation  sample  with  the  Registry 
sample.  Agreement  between  the  data  sources  is  "near 
perfect,"  as  indicated  by  kappa  statistic  values.  This  table 
also shows that ICD-9-CM fifth-digit codes corresponding
to  birth  weight  ranges  are  well  populated  in  the  Registry 
sample,  with  an  overall  accuracy  of  98.1%  (1,633/1,664). 
For  this  outcome,  0.26%  of  normal-weight  births  were 
misclassified  and  96%  (24/25)  of  false  negatives  were  in 
the  2000-2499  gram  category,  or  nearly  normal  weight.  Of 
interest,  there  were  101  infants  in  the  validation  sample  who 
were  both  preterm  and  low  birth  weight,  out  of  185  and  110, 
respectively  (data  not  shown). 

Discussion 

The  increased  risk  of  death  and  long-term  health 
complications  associated  with  preterm  birth  and  low  birth 
weight  makes  it  necessary  to  continue  surveillance  and 
research  of  these  important  outcomes.  Although  abstraction 
of  medical  records  is  the  preferred  method  for  assessing 
preterm  and  low-weight  births,  the  size  and  scope  of  most 
surveillance programs make such an approach cost
prohibitive and logistically impossible, as data extraction requires
significant time and knowledge from medically experienced
individuals, particularly when information is not readily
available on medical summary (face) sheets.27-29 The DoD
Birth and Infant Health Registry is a global monitoring and
research program that relies on ICD-9-CM codes obtained
from electronic data sources to assess a variety of
reproductive health outcomes. These analyses demonstrate that the
Registry  is  a  reliable  tool  for  assessing  preterm  birth  and 
low  birth  weight  in  a  large  and  geographically  diverse 
population. 
Table 3. Birth Weight Agreement and Comparison Between Validation Sample and the Registry Sample

| Birth Weight, in grams (n=1664)* | Validation Sample | Registry Sample | Number in Agreement | Percent Sensitivity | Percent Specificity | Percent Agreement | Kappa Statistic† | 95% CI† |
|---|---|---|---|---|---|---|---|---|
| ≥2500 normal weight | 1534 | 1554 | 1529 | 99.7 | 80.8 | 98.2 | 0.87 | 0.82, 0.91 |
| <2500 low weight** | 130 | 110 | 105 | 80.8 | 99.7 | 98.2 | 0.87 | 0.82, 0.91 |
| <1000 extremely low | 13 | 15 | 13 | 100 | 99.9 | 99.9 | 0.93 | |

Registry Sample Birth Weight Categories by ICD-9-CM Codes

| Birth Weight with ICD-9-CM codes, validation sample‡ | 764.x9 | 764.x8 | 764.x7 | 764.x6 | 764.x5 | 764.x4 | 764.x3 | 764.x2 | 764.x1 | Validation Sample totals |
|---|---|---|---|---|---|---|---|---|---|---|
| ≥2500, 764.x9 | 1529 | 2 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1534 |
| 2000-2499, 764.x8 | 24 | 50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 74 |
| 1750-1999, 764.x7 | 1 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 15 |
| 1500-1749, 764.x6 | 0 | 0 | 0 | 10 | 0 | 1 | 0 | 0 | 0 | 11 |
| 1250-1499, 764.x5 | 0 | 0 | 0 | 1 | 8 | 0 | 0 | 0 | 0 | 9 |
| 1000-1249, 764.x4 | 0 | 0 | 0 | 1 | 0 | 6 | 1 | 0 | 0 | 8 |
| 750-999, 764.x3 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 7 |
| 500-749, 764.x2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 6 |
| <500, 764.x1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Registry sample totals | 1554 | 52 | 14 | 12 | 10 | 7 | 8 | 6 | 1 | 1664 |

CI = confidence interval.
*The sample size for birth-weight comparisons is 1664 because 5 reviewed records lacked birth weight data. All birth weights are in grams.
†P < 0.0001 for all kappa statistics, exact P values computed for extremely low birth weight; 95% CI = 95% confidence interval for the kappa statistic, not included for extremely low birth weight due to small sample size.
**Includes the extremely low birth weight birth records.
‡Registry data from ICD-9-CM fifth digit codes: 764.xx, 765.0x and 765.1x refer to specific weight ranges. On this table 764.xx codes also represent 765.0x and 765.1x codes.


The reviewed validation sample reflects infants
captured in the Registry born at MTFs from 1999-2002, with
a deliberate oversampling of infants with birth defects. In
analyses, kappa statistics indicate "near perfect" agreement
for all outcomes assessed between datasets, except for
extreme preterm births, where our sample size was small
and agreement was only "moderate." In addition, when the
threshold for preterm births was lowered from an EGA of
<37 weeks to <36 weeks, the sensitivity increased,
specificity decreased, and total agreement and kappa statistics
decreased slightly. Researchers often use a more stringent
threshold for preterm births in an effort to avoid
diagnostic misclassification of infants born near term.30,31 This
approach, however, is not appropriate for investigations
using data derived from ICD-9-CM codes, where diagnoses
are classified by EGA and birth-weight ranges. These
analyses validate Registry studies using the standard
definitions for preterm (<37 weeks) and low-weight births (<2500
grams) knowing that corresponding ICD-9-CM codes
accurately reflect medical record data.

Limitations 

The primary limitation is that this study compared
agreement only for births occurring in MTFs, thus excluding
the 40% of Registry births that occurred in civilian facilities,
for the years 1999-2002. This study is also limited by the
omission of same-sex multiples from Registry analytical
data sets. However, many investigations of preterm and
low birth weight are limited to singleton births, which
somewhat mitigates this limitation. The Registry will also
not account for infants born in DoD hospitals if their
military parent leaves the service shortly before or immediately
after birth. In these situations, military medical benefits
would continue for obstetric care, but would not cover the
infant's later medical care. The inability to match infant
records could also occur when an infant's identifying
information changes, or when changes occur to the "official"
military sponsor parent, for dual military parent families,
if the change occurred between the date of birth and the
date of assignment of an infant's unique medical
identification number. These infants would be in the Registry,
though, for this study, would not match records based on
selected variables. At most, these latter examples represent
18 unmatched infant records from the available records
reviewed.

Another limitation is the deliberate oversampling of
infants with birth defects in the validation sample. Infants
with birth defects are more likely to be preterm or low birth
weight.9,10 As a result, these infants require more medical
care, as a group, and therefore have more opportunity for
ICD-9-CM coding of any preterm or low-weight births.
However, these results show no differences in measures
of agreement based on the presence or absence of a birth
defect. Another limitation is the military branch of service
imbalance in this study, with a slightly higher percentage
of Navy family births and a much lower percentage of
Air Force family births. Again, there were no differences in
measures of agreement based on service branch or
hospital-specific characteristics between data sources. Nevertheless,
this dissimilarity in the composition of the validation
sample and the Registry sample deserves noting as there
are differences in the military services with respect to racial
and ethnic composition of personnel, and the proportions
of different military rank and personnel education levels,
which are factors shown to influence birth outcomes.2,3,32-40

A  final  limitation  of  this  study  is  that  the  data  are  from 
1999-2002,  which  largely  predates  the  addition  of  the  fifth 
digit  to  ICD-9-CM  code  765.2x.  In  fiscal  year  2003,  the  fifth 
digit  was  added  to  765.2x  to  specify  EGA  week  ranges,  and 
it  provides  a  more  refined  demarcation  of  preterm  birth 
outcomes.  There  may  be  misclassification  for  EGA  due  to 
the  use  of  less  specific  codes  for  preterm  births  for  infants 
born  prior  to  this  ICD-9-CM  update.  Future  analyses  of 
preterm  birth  could  reduce  this  potential  bias  by  limiting 
the  study  population  to  infants  born  during  fiscal  year  2003 
or  later. 

Strengths 

The principal strength of this investigation is the large
number of assessed birth records coming from a
geographically diverse selection of MTFs. These records, matched and
compared with the Registry sample, provide statistics on the
accuracy of ICD-9-CM-derived information in the Registry.
Additionally, a weighted adjustment to correct for the high
percentage of infants with birth defects in the validation
sample suggests that the Registry misclassifies
(over-identifies) full-term births by 1.2%, and normal-weight births by
0.26%, for MTF births from 1999-2002 (data not shown). In
our sample, the majority of this misclassification occurred
among infants between 36 and 37 weeks of gestation, or
nearly full-term, and nearly all of the birth-weight
misclassification occurred among infants between 2000 and 2499
grams, or nearly normal birth weight. This finding provides
the Registry team with specific parameters for conducting
sensitivity analyses in other studies involving these birth
outcomes.

Conclusion 

Public  health  efforts  to  improve  birth  outcomes  require 
surveillance  systems  that  capture  population-wide  data 
efficiently  and  accurately.  This  specific  study  improves 
knowledge  on  the  accuracy  and  completeness  of  data 
capture  using  ICD-9-CM  coding  for  preterm  and  low  birth 
weight  outcomes  in  the  over  1-million-subject  DoD  Birth 
and  Infant  Health  Registry  by  demonstrating  agreement 
between  the  Registry  and  medical  record  data.  Further,  it 
establishes  specific  misclassification  parameters  in  support 
of  other  Registry  studies  assessing  preterm  and  low  birth 
weight  outcomes.  These  results  also  provide  a  measure 
of validity for investigators who rely on ICD-9-CM
code-derived gestational age and birth weight information.